Abstract. In this paper we obtain time-uniform propagation estimates for systems of interacting diffusion processes. Using a well-defined metric function $h$, our result guarantees time-uniform estimates for the convergence of a class of interacting stochastic differential equations towards their mean field equation. The result holds for a general model satisfying various conditions which ensure that the decay associated with the internal dynamics term dominates the interaction and noise terms. Our result should have diverse applications, particularly in neuroscience, and allows for models more elaborate than that of Wilson and Cowan, since it does not require the internal dynamics to be of linear decay. An example illustrating the result is given at the end of this work.

1. Introduction


In this paper we obtain time-uniform estimates for the convergence of a class of interacting diffusions, given by stochastic differential equations, towards the associated mean field equation. The propagation of chaos resulting from this convergence as the number of particles $N$ tends to infinity is uniform in time: not only do the particles become independent of each other in the limit, but this independence is also reached uniformly in time.

The $N$-particle interacting diffusion model is of the following form
\[ X_t^j = x_{ini} + \int_0^t \Big( b_0(X_s^j) + \frac{1}{N}\sum_{k=1}^{N} b_1(X_s^j, X_s^k) \Big)\, ds + \int_0^t b_2(X_s^j)\, dW_s^j. \tag{1} \]

Here $\{W^j\}_{j \in \mathbb{Z}^+}$ are independent Wiener processes, $x_{ini}$ is a constant and $b_0, b_2 : \mathbb{R} \to \mathbb{R}$, $b_1 : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ are measurable functions. We will explain further below our reasons for studying this type of model. For a probability measure $\gamma$ on $\mathbb{R}$, write $b_1(x, \gamma) = \int_{\mathbb{R}} b_1(x,y)\, d\gamma(y)$. The limiting processes $(\bar X_t^j)$ are defined to be
\[ \bar X_t^j = x_{ini} + \int_0^t \Big( b_0(\bar X_s^j) + b_1(\bar X_s^j, \bar\mu_s) \Big)\, ds + \int_0^t b_2(\bar X_s^j)\, dW_s^j, \tag{2} \]
where $\bar\mu_s$ is the law of $\bar X_s^j$. The classical propagation of chaos result states that, under suitable conditions on $b_0$, $b_1$ and $b_2$, the probability law of $X^j$ over some fixed time interval $[0,T]$ (this being a probability law on $C([0,T];\mathbb{R})$) converges weakly to the probability law of $\bar X^j$. Refer to [l], 0, [I?} for more details. We briefly consider the following toy model to motivate our problem. Consider for the moment the system
\[ Y_t^j = y_{ini} + \frac{1}{N}\int_0^t \sum_{k=1}^{N} b_1(Y_s^j, Y_s^k)\, ds + W_t^j, \tag{3} \]
where $b_1$ has Lipschitz constant $b_1^{Lip}$. Define
\[ \bar Y_t^j = y_{ini} + \int_0^t \int_{\mathbb{R}} b_1(\bar Y_s^j, y)\, d\bar\mu_s(y)\, ds + W_t^j, \tag{4} \]
where $\bar\mu_s$ is the law of $\bar Y_s^j$. Assume that both of the above equations have strong solutions. Using Gronwall's inequality and the Cauchy-Schwarz inequality, [17] obtained a bound of the form


E 

sup 

< exp (2Tb 1 jJ pS ] sup E 

h(xi,Y a k ) 2 ' 


te[o,T] 

v ; se[o,T] 



It is clear from the above that Y/ is a good approximation to Y^ when NTb^j p <C 
1. It is also clear that as $T \to \infty$ this bound becomes very poor, particularly because of the exponential factor. In many applications of interacting diffusions, such as neuroscience, it is difficult to assume that $T$ is small: indeed, it is often difficult to properly model the 'start' of a system. It is therefore desirable to obtain convergence results which are uniform in time. This is the focus of this paper.
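To make the kind of system under discussion concrete, the following is a minimal Euler-Maruyama sketch of an interacting particle system with a local drift, a mean-field interaction and a noise coefficient. The function name `simulate_particles`, the step size, and the particular coefficients passed in at the bottom are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def simulate_particles(n, t_end, dt, b0, b1, b2, x_ini=0.0, seed=0):
    """Euler-Maruyama sketch for n interacting diffusions.

    b0 and b2 act elementwise on an array of states; b1(x, y) is a pairwise
    interaction evaluated on broadcast arrays and averaged over the second index.
    Returns an array of shape (steps + 1, n) holding the sampled paths.
    """
    rng = np.random.default_rng(seed)
    steps = int(round(t_end / dt))
    x = np.full(n, float(x_ini))
    path = np.empty((steps + 1, n))
    path[0] = x
    for i in range(steps):
        # (1/n) * sum_k b1(X^j, X^k) for every particle j
        interaction = b1(x[:, None], x[None, :]).mean(axis=1)
        noise = rng.normal(0.0, np.sqrt(dt), size=n)
        x = x + (b0(x) + interaction) * dt + b2(x) * noise
        path[i + 1] = x
    return path

# Hypothetical coefficients: linear decay, bounded interaction, unit noise.
paths = simulate_particles(
    n=50, t_end=1.0, dt=0.01,
    b0=lambda x: -x,
    b1=lambda x, y: np.tanh(y - x),
    b2=lambda x: np.ones_like(x),
)
```

Running the sketch for increasing `t_end` at fixed `n` is one informal way to see whether the empirical behaviour degrades with the time horizon; it is only an illustration and proves nothing about the estimates.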

For $x, y \in \mathbb{R}$, let
\[ h(x,y) = g(x)\,g(y)\,f(x-y), \tag{5} \]
for some functions $f \ge 0$ and $g \ge 1$ described further below. We expect (but do not require) $f$ to be of the form $f(z) = f_{const}\, z^{2k}$ where $k$ is a positive integer. The function $f$ modulates the rate of convergence when $X_t^j$ is 'close' to $\bar X_t^j$, and $g \ge 1$ is a weight function which modulates the behavior when $|X_t^j|$ or $|\bar X_t^j|$ tend to $\infty$. If $h$ is a metric, then this result guarantees that the Wasserstein distance (with respect to $h$) between the laws of $X_t^j$ and $\bar X_t^j$ converges to zero as $N \to \infty$, with a rate which is uniform in $t$. As a consequence of Theorem 1.2, and since $h$ is a metric, the joint law of any finite set of neurons (or particles in the general case) converges to a tensor product of iid processes, each with law given by the SDE in (2). This is easily verified, as explained in Corollary 3.1.

To the best of our knowledge, the first work on uniform propagation of chaos was [14], in the context of approximating Feynman-Kac formulae for nonlinear filtering. Other authors applied log-Sobolev inequalities and concentration inequalities USiEEE (20|. Most of the previously cited works assume that the interaction term is of the form $b_1(x,y) = \nabla F(x-y)$ and the local term is of the form $b_0 = \nabla V$, for some $F$, $V$ satisfying certain convexity properties. This work is essentially a generalization of


We are motivated in particular by the application of these models to neuroscience (see for instance 0,0-E3, m, 03 , HH ), although we expect that these results are applicable in other domains such as agent-based modeling in finance, insect swarms, granular models and various other applications of statistical physics. We have been able to weaken some of the requirements in [20] and other works, so that the results may be applied in arguably more biologically realistic contexts. We do not assume that the interaction term $b_1(x,y)$ is a function of $x-y$, as in many of the previously cited works. The uniform propagation of chaos result is essentially due to the stabilizing effect of the internal dynamics (the $b_0$ term) outweighing the destabilizing effect of the inputs from other neurons (the $b_1$ term) and the noise (the $b_2$ term). In [20], it was assumed that the gradient of $b_0$ is always negative, and is at least linear. However, it is not clear (at least in the context of neuroscience) that the decay resulting from the internal dynamics term is always this strong for large values of $|X_t^j|$. Neuroscientific models are only experimentally validated over a finite parameter range, and therefore it is not certain how to model the dynamics when the state variable $X_t^j$ is very large or very small. Our more abstract setup does not require the decay to be linear (as in, for example, the Wilson-Cowan model) for large values of $|X_t^j|$. Indeed the decay could be sub-linear or super-linear; all that is required is that in the asymptotic limit the decay from $b_0$ dominates the destabilizing effects of $b_1$ and $b_2$. Another improvement of our model over [20] is that we consider multiplicative noise (i.e. $b_2 \neq 1$). This is more realistic because we expect the noise term $\int_0^t b_2(X_s^j)\, dW_s^j$ to be of decreasing influence as $|X_t^j|$ gets large: one would expect in general that the system is less responsive to the noise when its activity is greatly elevated, since the system should be stable. The point is that experimentalists should have some liberty in fitting our model to experimental data, provided that the asymptotic domination of $b_1$ and $b_2$ by $b_0$ holds.
We do not delve into the details of existence and uniqueness of solutions, and so 
throughout we assume that 

Assumption 1.1 There exist unique strong solutions to (1) and (2).

Our major result is the following uniform convergence property. 

Theorem 1.2: If Assumption 1.1 and the assumptions in Section 2 hold, then there exists a constant $K$ such that for all $t \ge 0$
\[ E\big[ h(X_t^j, \bar X_t^j) \big] \le K N^{-\frac{a}{q(a-1)}} \]
for integers $a > 1$ and $q > 1$.

It is easy to show existence and uniqueness if, for example, $b_0$, $b_1$ and $b_2$ are each globally Lipschitz. For the existence and uniqueness of (1), [13, Theorem 3.6] provides a useful general criterion. Refer to [3] for a discussion of how to treat the existence and uniqueness of (2) in a more general case.

Our paper is structured as follows. In Section 2 we outline the assumptions of our model, in Section 3 we prove Theorem 1.2, and in Section 4 we outline an example of a system satisfying the assumptions of Section 2.


2. Assumptions 

The requirements outlined below might seem quite tedious. However, in Section 4 we consider an application which allows us to simplify many of them. We split $\mathbb{R}$ into two domains $\mathcal{D}$ and $\mathcal{D}^c$. $\mathcal{D} \subset \mathbb{R}$ is a compact interval in which we







February 24, 2017 6:9 Stochastics: An International Journal of Probability and Stochastic Processes 

Uniform ’ propagation ‘ of chaos ‘ Oct' 2015 



expect the system to be most of the time. Over $\mathcal{D}$, we require that the natural convexity of $b_0$ dominates that of $b_1$ and $b_2$. In $\mathcal{D}^c$ we require bounds for when the absolute values of the variables are asymptotically large.

Assume that $f \ge 0$, that $f(z) = f(-z)$, that $g \ge 1$, and that $f(z) = 0$ if and only if $z = 0$. Suppose that $g(z) = 1$ for $z \in \mathcal{D}$, so that $g'(z) = 0$ there. Write $\tilde b_0 = -b_0$. Assume that for all $x, y \in \mathbb{R}$,
\[ f'(x-y)\big(\tilde b_0(x) - \tilde b_0(y)\big) - \tfrac{1}{2} f''(x-y)\big(b_2(x) - b_2(y)\big)^2 \ge 0. \tag{6} \]
Assume that there exists a constant $c_0 > 0$ such that for all $x, y \in \mathcal{D}$,
\[ f'(x-y)\big(\tilde b_0(x) - \tilde b_0(y)\big) - \tfrac{1}{2} f''(x-y)\big(b_2(x) - b_2(y)\big)^2 \ge c_0 f(x-y). \tag{7} \]
Assume that there is some $a > 1$ such that for all $z \in \mathbb{R}$,
\[ f'(z)^a \le f(z). \tag{8} \]
Assume that there exists a constant $a_0 \in \mathbb{R}$ such that for all $x \notin \mathcal{D}$,
\[ \frac{g'(x)}{g(x)}\, b_2(x) \le a_0. \tag{9} \]
Assume that for $x \notin \mathcal{D}$, for all probability measures $\gamma$ and all $y \in \mathbb{R}$,
\[ \frac{g'(x)}{g(x)} \Big( \tilde b_0(x) - b_1(x,\gamma) - \frac{f'(x-y)}{f(x-y)}\, b_2(x)\big(b_2(x) - b_2(y)\big) - \frac{1}{2} a_0 b_2(x) \Big) \ge c_0, \tag{10} \]
\[ \frac{g''(x)}{g(x)}\, b_2(x)^2 \le c_2. \tag{11} \]
Assume that there exist constants $c_1, \bar c_1 \in \mathbb{R}$ such that for all $x, y_1, y_2, z \in \mathbb{R}$,
\[ g(x)^{\frac{2(a-1)}{a}} \big| b_1(x,y_1) - b_1(x,y_2) \big| \le c_1\, g(y_1)^{\frac{a-1}{a}} g(y_2)^{\frac{a-1}{a}} f(y_1 - y_2)^{\frac{a-1}{a}}, \tag{12} \]
\[ \big| b_1(y_1,z) - b_1(y_2,z) \big| \le \bar c_1\, f(y_1 - y_2)^{\frac{a-1}{a}}. \tag{13} \]

Assumption (12) might seem a little strange. If $g(x) \to \infty$ as $|x| \to \infty$, then in the context of neuroscience it would mean that the relative influence of neuron $k$ on neuron $j$ decreases as $|X_t^j| \to \infty$. This seems biologically reasonable. We assume that $c_0$ dominates the other terms, i.e.
\[ c := c_0 - c_1 - \bar c_1 - c_2 > 0. \tag{14} \]

For some positive integer q > 2, we require that there exists a constant C 2 such 
that for all s > 0 , 


E [bi(x s ,j2 s y] < c 2 . 


(15) 
Assume that there exists a constant $C_1 > 0$ such that for all $s \ge 0$ and for all $N$,
\[ E\Big[ g(X_s^j)^{\frac{2(a-1)q}{aq-a-q}} \Big],\; E\Big[ g(\bar X_s^j)^{\frac{2(a-1)q}{aq-a-q}} \Big] \le C_1. \tag{16} \]


3. Proof of Theorem 1.2

We now outline the proof of Theorem 1.2.

Proof: We will prove that there exists a constant $C$ such that
\[ E\big[h(X_t^j, \bar X_t^j)\big] \le \int_0^t -c\, E\big[h(X_s^j, \bar X_s^j)\big] + C N^{-\frac{1}{q}} E\big[h(X_s^j, \bar X_s^j)\big]^{\frac{1}{a}}\, ds. \tag{17} \]
The theorem will then follow from the application of Lemma 3.3 to the above result.

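We observe using Itô's Lemma that
\[ h(X_t^j, \bar X_t^j) = I_1 + I_2 + I_2' + I_3 + I_4 + I_5 + \int_0^t \Big( \frac{\partial h}{\partial x}\, b_2(X_s^j) + \frac{\partial h}{\partial y}\, b_2(\bar X_s^j) \Big)\, dW_s^j. \tag{18} \]
Inside the integrals below we abbreviate $X = X_s^j$ and $\bar X = \bar X_s^j$, we write $\hat\mu_s^N = \frac{1}{N}\sum_{k=1}^{N} \delta_{X_s^k}$ for the empirical measure, and we recall that $\tilde b_0 = -b_0$. The $I_j$ are
\begin{align}
I_1 &= \int_0^t -g(X)g(\bar X) f'(X - \bar X)\big(\tilde b_0(X) - \tilde b_0(\bar X)\big) + \tfrac{1}{2} g(X)g(\bar X) f''(X - \bar X)\big(b_2(X) - b_2(\bar X)\big)^2\, ds \tag{19} \\
I_2 &= \int_0^t f(X - \bar X) g(X) g'(\bar X)\big(b_1(\bar X, \bar\mu_s) - \tilde b_0(\bar X)\big) - f'(X - \bar X) g(X) g'(\bar X) b_2(\bar X)\big(b_2(\bar X) - b_2(X)\big)\, ds \tag{20} \\
I_2' &= \int_0^t f(X - \bar X) g'(X) g(\bar X)\big(b_1(X, \hat\mu_s^N) - \tilde b_0(X)\big) + f'(X - \bar X) g'(X) g(\bar X) b_2(X)\big(b_2(X) - b_2(\bar X)\big)\, ds \tag{21} \\
I_3 &= \int_0^t g(X) g(\bar X) f'(X - \bar X)\big(b_1(X, \hat\mu_s^N) - b_1(\bar X, \bar\mu_s)\big)\, ds \tag{22} \\
I_4 &= \int_0^t f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X)\, ds \tag{23} \\
I_5 &= \frac{1}{2}\int_0^t f(X - \bar X)\big( g''(X) g(\bar X) b_2(X)^2 + g''(\bar X) g(X) b_2(\bar X)^2 \big)\, ds \tag{24}
\end{align}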
We start by establishing that
\[ E\big[I_1 + I_2 + I_2' + I_4\big] \le -c_0 \int_0^t E\big[h(X_s^j, \bar X_s^j)\big]\, ds. \tag{25} \]
We prove that the sum of the integrands of $I_1$, $I_2$, $I_2'$ and $I_4$ is less than or equal to $-c_0 h(X_s^j, \bar X_s^j)$. Throughout we abbreviate $X = X_s^j$ and $\bar X = \bar X_s^j$, write $\hat\mu_s^N$ for the empirical measure of $(X_s^k)_{k=1}^N$, and recall that $\tilde b_0 = -b_0$. Suppose firstly that $X, \bar X \in \mathcal{D}$. Then the integrands of $I_2$, $I_2'$ and $I_4$ are all zero. Furthermore, using (7), the integrand of $I_1$ satisfies the bound
\[ -g(X)g(\bar X) f'(X - \bar X)\big(\tilde b_0(X) - \tilde b_0(\bar X)\big) + \tfrac{1}{2} g(X)g(\bar X) f''(X - \bar X)\big(b_2(X) - b_2(\bar X)\big)^2 \le -c_0 h(X, \bar X). \]
Now suppose that $\bar X \notin \mathcal{D}$. The integrand of $I_1$ is less than or equal to zero because of (6). Through (10),
\[ g'(\bar X)\Big[ \big(\tilde b_0(\bar X) - b_1(\bar X, \bar\mu_s)\big) f(\bar X - X) - f'(\bar X - X) b_2(\bar X)\big(b_2(\bar X) - b_2(X)\big) - \tfrac{1}{2} a_0 b_2(\bar X) f(\bar X - X) \Big] \ge g(\bar X)\, c_0 f(\bar X - X). \tag{26} \]
Since $f(-z) = f(z)$ and $f'(-z) = -f'(z)$, upon multiplying the above inequality by $-g(X)$,
\begin{align}
& g(X) g'(\bar X)\Big[ f(X - \bar X)\big(b_1(\bar X, \bar\mu_s) - \tilde b_0(\bar X)\big) + f'(\bar X - X) b_2(\bar X)\big(b_2(\bar X) - b_2(X)\big) \Big] + \tfrac{1}{2} f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X) \nonumber \\
&\quad \le -c_0 h(X, \bar X) - \tfrac{1}{2} a_0\, g(X) g'(\bar X) b_2(\bar X) f(X - \bar X) + \tfrac{1}{2} f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X) \le -c_0 h(X, \bar X), \tag{27}
\end{align}
since by (9),
\[ \tfrac{1}{2} a_0\, g(X) g'(\bar X) b_2(\bar X) f(X - \bar X) - \tfrac{1}{2} f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X) \ge 0. \]
Notice that the left hand side of (27) is the sum of the integrand of $I_2$ and half of the integrand of $I_4$. Similarly if $X \notin \mathcal{D}$, the integrand of $I_1$ is less than or equal to zero, and through (9) and (10),
\begin{align}
& g'(X) g(\bar X)\Big[ f(X - \bar X)\big(b_1(X, \hat\mu_s^N) - \tilde b_0(X)\big) + f'(X - \bar X) b_2(X)\big(b_2(X) - b_2(\bar X)\big) \Big] + \tfrac{1}{2} f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X) \nonumber \\
&\quad \le -c_0 h(X, \bar X) - \tfrac{1}{2} a_0\, g'(X) g(\bar X) b_2(X) f(X - \bar X) + \tfrac{1}{2} f(X - \bar X) g'(X) g'(\bar X) b_2(X) b_2(\bar X) \le -c_0 h(X, \bar X). \tag{28}
\end{align}
The left hand side of the above is equal to the sum of the integrand of $I_2'$ and half of the integrand of $I_4$. Observe that if $\bar X \in \mathcal{D}$, then the left hand side of (27) is zero because $g'$ is zero in $\mathcal{D}$. Similarly if $X \in \mathcal{D}$, then the left hand side of (28) is zero because $g'$ is zero in $\mathcal{D}$. These considerations yield the bound (25).

It follows from (11) that
\[ E[I_5] \le c_2 \int_0^t E\big[h(X_s^j, \bar X_s^j)\big]\, ds. \]


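We finish by bounding the $I_3$ term. In what follows we suppress the time index $s$, writing $X^j = X_s^j$ and $\bar X^j = \bar X_s^j$. The interaction difference in the integrand of $I_3$ decomposes as
\[ \frac{1}{N}\sum_{k=1}^{N} b_1(X^j, X^k) - b_1(\bar X^j, \bar\mu_s) = \frac{1}{N}\sum_{k=1}^{N} \big( b_1(X^j, X^k) - b_1(\bar X^j, \bar X^k) \big) + \frac{1}{N}\sum_{k=1}^{N} \big( b_1(\bar X^j, \bar X^k) - b_1(\bar X^j, \bar\mu_s) \big), \]
and we bound the two sums separately. Suppose that $g(X^j) \ge g(\bar X^j)$. Then using (8), (12), (13) and the triangle inequality,
\begin{align*}
& f'(X^j - \bar X^j) g(X^j) g(\bar X^j) \big( b_1(X^j, X^k) - b_1(\bar X^j, \bar X^k) \big) \\
&\le |f'(X^j - \bar X^j)|\, g(X^j) g(\bar X^j) \big| b_1(X^j, \bar X^k) - b_1(\bar X^j, \bar X^k) \big| + |f'(X^j - \bar X^j)|\, g(X^j) g(\bar X^j) \big| b_1(X^j, X^k) - b_1(X^j, \bar X^k) \big| \\
&\le \bar c_1 |f'(X^j - \bar X^j)|\, g(X^j) g(\bar X^j) f(X^j - \bar X^j)^{\frac{a-1}{a}} + |f'(X^j - \bar X^j)|\, g(X^j)^{\frac{1}{a}} g(\bar X^j)^{\frac{1}{a}} g(X^j)^{\frac{2(a-1)}{a}} \big| b_1(X^j, X^k) - b_1(X^j, \bar X^k) \big| \\
&\le \bar c_1\, g(X^j) g(\bar X^j) f(X^j - \bar X^j) + c_1 f(X^j - \bar X^j)^{\frac{1}{a}} g(X^j)^{\frac{1}{a}} g(\bar X^j)^{\frac{1}{a}} g(X^k)^{\frac{a-1}{a}} g(\bar X^k)^{\frac{a-1}{a}} f(X^k - \bar X^k)^{\frac{a-1}{a}}.
\end{align*}
We obtain the same inequality when $g(\bar X^j) \ge g(X^j)$, this time inserting $b_1(\bar X^j, X^k)$ in the triangle inequality and applying (12) with $x = \bar X^j$. Applying Hölder's inequality to the above, and using that the pairs $(X^j, \bar X^j)$ and $(X^k, \bar X^k)$ are identically distributed,
\begin{align*}
E\Big[ f'(X^j - \bar X^j) g(X^j) g(\bar X^j) \big( b_1(X^j, X^k) - b_1(\bar X^j, \bar X^k) \big) \Big]
&\le \bar c_1 E\big[ g(X^j) g(\bar X^j) f(X^j - \bar X^j) \big] + c_1 E\big[ g(X^j) g(\bar X^j) f(X^j - \bar X^j) \big]^{\frac{1}{a}} E\big[ g(X^k) g(\bar X^k) f(X^k - \bar X^k) \big]^{\frac{a-1}{a}} \\
&= (\bar c_1 + c_1)\, E\big[ g(X^j) g(\bar X^j) f(X^j - \bar X^j) \big].
\end{align*}
For the second sum, we use Hölder's inequality to see that
\begin{align*}
& E\Big[ f'(X^j - \bar X^j) g(X^j) g(\bar X^j)\, \frac{1}{N}\sum_{k=1}^{N} \big( b_1(\bar X^j, \bar X^k) - b_1(\bar X^j, \bar\mu_s) \big) \Big] \\
&\le E\Big[ |f'(X^j - \bar X^j)|^a g(X^j) g(\bar X^j) \Big]^{\frac{1}{a}} \times E\Big[ g(X^j)^{\frac{2(a-1)q}{aq-a-q}} \Big]^{\frac{aq-a-q}{2aq}} \times E\Big[ g(\bar X^j)^{\frac{2(a-1)q}{aq-a-q}} \Big]^{\frac{aq-a-q}{2aq}} \times E\Big[ \Big| \frac{1}{N}\sum_{k=1}^{N} b_1(\bar X^j, \bar X^k) - b_1(\bar X^j, \bar\mu_s) \Big|^q \Big]^{\frac{1}{q}},
\end{align*}
where $q$ is the integer that appears in assumption (15). By Assumption (16),
\[ E\Big[ g(X^j)^{\frac{2(a-1)q}{aq-a-q}} \Big]^{\frac{aq-a-q}{2aq}} \times E\Big[ g(\bar X^j)^{\frac{2(a-1)q}{aq-a-q}} \Big]^{\frac{aq-a-q}{2aq}} \]
is uniformly bounded for all $s$. Furthermore, through Assumption (15) and Lemma 3.2, $E\big[ \big( \frac{1}{N}\sum_{k=1}^{N} b_1(\bar X^j, \bar X^k) - b_1(\bar X^j, \bar\mu_s) \big)^q \big]$ is bounded by $C N^{-1}$. Finally, using Assumption (8),
\[ E\Big[ |f'(X^j - \bar X^j)|^a g(X^j) g(\bar X^j) \Big]^{\frac{1}{a}} \le E\big[ h(X^j, \bar X^j) \big]^{\frac{1}{a}}. \]
We thus find that for some constant $C$,
\[ E[I_3] \le \int_0^t (c_1 + \bar c_1)\, E\big[h(X_s^j, \bar X_s^j)\big] + C N^{-\frac{1}{q}} E\big[h(X_s^j, \bar X_s^j)\big]^{\frac{1}{a}}\, ds. \]
In summary, noting the assumption (14), we now have all the ingredients for (17).

□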

Corollary 3.1: Let $l \in \mathbb{N}^*$ and fix $l$ neurons $(i_1, \dots, i_l) \in (\mathbb{N}^*)^l$. Under the assumptions of Theorem 1.2, the law of $(X_t^{i_1}, \dots, X_t^{i_l})$ converges toward $\bar\mu_t^{\otimes l}$ for all $t \ge 0$.

Proof: 


E 


\(Xfi,...,Xt)-(Xfi 




k= 1 



< IKN «Ca°-l) ) 


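Hence for all $t \ge 0$ the law of $(X_t^{i_1}, \dots, X_t^{i_l})$ converges, as $N$ tends to infinity, to the law of $(\bar X_t^{i_1}, \dots, \bar X_t^{i_l})$, which is equal to $\bar\mu_t^{\otimes l}$ by definition. □

We now present the lemmas used in the proof of Theorem 1.2.

Lemma 3.2: Suppose that $(e^j)_{j=1}^{\infty}$ are independent identically-distributed $\mathbb{R}$-valued random variables such that $E[(e^j)^q] < \infty$ and $E[e^j] = 0$. Then there exists a constant $C$ such that for all $N$
\[ E\Big[ \Big( \sum_{j=1}^{N} e^j \Big)^q \Big] \le C N^{q-1}. \]

Proof: Consider the multinomial expansion of $\big(\sum_{j=1}^{N} e^j\big)^q$. There are $N^q$ terms in total. The expectation of at least $N \times (N-1) \times (N-2) \times \dots \times (N-q+1)$ of these must be zero: in each such term the indices are pairwise distinct, so the factors are independent and have mean zero. Let $(j_i)_{i=1}^{q}$, $1 \le j_i \le N$, be an arbitrary set of indices. Then through Hölder's inequality,
\[ E\Big[ \prod_{i=1}^{q} e^{j_i} \Big] \le E\big[(e^1)^q\big]. \]
Thus
\begin{align*}
E\Big[ \Big( \sum_{j=1}^{N} e^j \Big)^q \Big]
&\le E\big[(e^1)^q\big] \times \big( N^q - N \times (N-1) \times (N-2) \times \dots \times (N-q+1) \big) \\
&\le E\big[(e^1)^q\big] \times \big( N^q - (N-q+1)^q \big) \\
&= E\big[(e^1)^q\big]\, N^{q-1} \Big( N - N \Big( 1 - \frac{q-1}{N} \Big)^q \Big) \le E\big[(e^1)^q\big]\, N^{q-1}\, q(q-1),
\end{align*}
where the last step uses Bernoulli's inequality $(1-x)^q \ge 1 - qx$ with $x = (q-1)/N$, valid for $N \ge q-1$.

□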
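The counting step in the proof of Lemma 3.2 can be checked numerically. The sketch below counts the index tuples of length $q$ with at least one repeated index (the only terms of the expansion whose expectation can be non-zero) and compares that count with a bound of order $N^{q-1}$. The constant `comb(q, 2)` used here is the standard union bound over which pair of positions coincides; it is an assumption of this illustration rather than the constant used in the paper, and the function names are hypothetical.

```python
from math import comb, perm

def repeated_index_tuples(n: int, q: int) -> int:
    """Number of q-tuples over {1, ..., n} with at least one repeated index."""
    return n ** q - perm(n, q)  # perm(n, q) = n (n-1) ... (n-q+1)

def counting_bound(n: int, q: int) -> int:
    """Union bound: pick the pair of positions that coincide, then fill freely."""
    return comb(q, 2) * n ** (q - 1)

# Check the bound over a small grid of (n, q) values with n >= q.
for q in range(2, 7):
    for n in range(q, 60):
        assert repeated_index_tuples(n, q) <= counting_bound(n, q)
```

For fixed $q$ the count therefore grows like $N^{q-1}$ rather than $N^q$, which is the order of growth the lemma relies on.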

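The following lemma is an easy generalization of a result in [20].

Lemma 3.3: Suppose that $u$ is continuous and satisfies, for some constants $C, c > 0$ and positive integer $a > 1$, for all $t \le T$,
\[ u(T) - u(t) \le \int_t^T -c\,u(s) + C u(s)^{\frac{1}{a}}\, ds. \]
Furthermore $u(0) = 0$. Then for all $t \ge 0$
\[ u(t) \le \Big( \frac{C}{c} \Big)^{\frac{a}{a-1}}. \]

Proof: It may be seen that $u$ is differentiable, with the derivative satisfying
\[ \dot u(t) \le -c\,u(t) + C u(t)^{\frac{1}{a}}. \]
Let $v(t) = u(t) \exp(ct)$. Then
\[ \dot v(t) \le C v(t)^{\frac{1}{a}} \exp\Big( \frac{ct(a-1)}{a} \Big). \]
If $v(t) = 0$ then there is nothing to show. Thus we may assume that for all $t > 0$, $v(t) > 0$. Hence
\[ \dot v(t)\, v(t)^{-\frac{1}{a}} \le C \exp\Big( \frac{ct(a-1)}{a} \Big). \]
Upon integration,
\[ \frac{a}{a-1}\, v(t)^{\frac{a-1}{a}} \le \frac{a}{a-1} \frac{C}{c} \Big( \exp\Big( \frac{ct(a-1)}{a} \Big) - 1 \Big). \]
Thus
\[ v(t)^{\frac{a-1}{a}} \le \frac{C}{c} \exp\Big( \frac{ct(a-1)}{a} \Big), \qquad\text{that is,}\qquad v(t) \le \Big( \frac{C}{c} \Big)^{\frac{a}{a-1}} \exp(ct), \]
and therefore $u(t) = v(t)\exp(-ct) \le (C/c)^{\frac{a}{a-1}}$.

□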
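The mechanism behind Lemma 3.3 can be illustrated by integrating the comparison equation $\dot u = -cu + Cu^{1/a}$ numerically. The sketch below uses an explicit Euler scheme with hypothetical constants and checks that the trajectory stays below the equilibrium value $(C/c)^{a/(a-1)}$ of that equation. Two assumptions are specific to this illustration: the trajectory is started at a tiny positive value rather than at zero (zero is itself an equilibrium of the equality case, so the scheme would never leave it), and the step size satisfies `dt * c < 1` so that the Euler map is monotone.

```python
def integrate_comparison_ode(c, C, a, u0=1e-12, t_end=50.0, dt=0.01):
    """Explicit Euler for u' = -c*u + C*u**(1/a), started just above zero."""
    steps = int(round(t_end / dt))
    u = u0
    trajectory = [u]
    for _ in range(steps):
        u = u + dt * (-c * u + C * u ** (1.0 / a))
        trajectory.append(u)
    return trajectory

# Hypothetical constants chosen only for illustration.
c, C, a = 1.0, 0.5, 2
u_star = (C / c) ** (a / (a - 1))  # equilibrium of the comparison equation
traj = integrate_comparison_ode(c, C, a)
```

In this example the trajectory increases towards `u_star` without crossing it, which is the time-uniform behaviour the lemma formalizes.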


4. Application 

In this section we provide an example of a system satisfying the requirements of Section 2, so that the result of Theorem 1.2 applies. We start by defining the following functions.

For all $x \in \mathbb{R}$, let $f(x) := \frac{1}{4} x^2$. Let $\mathcal{D} = [-A, A]$ for some $A > 0$. We take $a = 2$ and $q = 3$. Define the sigmoid function $S(x) := \frac{1}{1 + \exp(-x)}$; it is clear that $S$ is of class $C^\infty$, that $0 < S(x) < 1$, and that its derivative is bounded and positive. Using this, we define
\[ g(x) = \begin{cases} 1 & \text{if } x \in \mathcal{D}, \\ S(-A - x) + \frac{1}{2} & \text{if } x < -A, \\ S(-A + x) + \frac{1}{2} & \text{if } x > A. \end{cases} \]
The function $g$ is continuous on $\mathbb{R}$ and satisfies $1 \le g(x) \le \frac{3}{2}$; its derivative $g'$ is bounded, negative for $x < -A$ and positive for $x > A$.
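As a quick sanity check on the stated properties of the weight function, the sketch below implements a sigmoid-based weight of the kind described above (equal to 1 on $[-A, A]$ and given by a shifted sigmoid plus one half outside) and evaluates it on a grid. The value of `A`, the grid, and the function names are hypothetical choices made for this illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid S(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def weight(x: float, A: float) -> float:
    """Weight g: equal to 1 on [-A, A], sigmoid-shaped outside."""
    if x < -A:
        return sigmoid(-A - x) + 0.5
    if x > A:
        return sigmoid(-A + x) + 0.5
    return 1.0

A = 2.0  # hypothetical half-width of the interval D
grid = [-10.0 + 0.01 * i for i in range(2001)]
values = [weight(x, A) for x in grid]
```

Since $S(0) = \frac{1}{2}$, the two outer branches meet the value 1 at $x = \pm A$, which is why the weight is continuous; the grid values can be inspected to confirm that they stay between 1 and $\frac{3}{2}$.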

We consider a population of $N$ neurons, with evolution equation
\[ dV_t^j = \Big( -V_t^j + \frac{1}{N}\sum_{k=1}^{N} J(V_t^j, V_t^k)\, S(V_t^k) + I(t) \Big)\, dt + \sigma\, dW_t^j, \tag{29} \]
where $V_t^j$ is the membrane potential of neuron $j$ and $I(t)$ is the deterministic input current. $J(V_t^j, V_t^k)$ denotes the synaptic weight from neuron $k$ to neuron $j$. The function $J : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is assumed to be of class $C^1$ in both variables, such that both it and its derivative are bounded.

The above assumptions are sufficient for the requirements of Section 2 to be satisfied. In particular, using the Mean Value Theorem, one can easily verify the bounds (12) and (13). Moreover, one can refer to [2] and verify that Assumption 1.1 is satisfied. It then follows, using Theorem 1.2, that for all $t \ge 0$
\[ E\Big[ (V_t^j - \bar V_t^j)^2 \Big] \le 4K N^{-\frac{2}{3}}. \]
In other words, the law of an individual neuron converges to its limit as $N \to \infty$ at the time-uniform rate given above.